add structured error types and retry logic for 429/5xx by ayaan-kapoor · Pull Request #7 · gumloop/gumloop-py

ayaan-kapoor · 2026-06-10T18:49:33Z

What

Add specific APIStatusError subclasses dispatched by status code:
BadRequestError (400), PermissionDeniedError (403), NotFoundError (404),
UnprocessableEntityError (422), RateLimitError (429), ServerError (5xx)
Add retry logic with exponential backoff to HttpClient and AsyncHttpClient
Surface max_retries on Gumloop and AsyncGumloop constructors (default: 2)

Why

Right now everything non-2xx raises a generic APIStatusError, so callers
have to inspect .status_code manually to know what went wrong. Specific
subclasses make error handling a lot cleaner — you can just catch
RateLimitError or NotFoundError directly.

The retry logic handles transient 429s and 5xx failures automatically.
Only server-side errors are retried, never 4xx. Respects Retry-After
when present, otherwise falls back to exponential backoff with jitter.
max_retries=0 disables it entirely if needed.

Tests

18 new tests in tests/sdk/test_errors.py.

greptile-apps · 2026-06-10T18:52:51Z

Greptile Summary

This PR adds a typed error hierarchy (BadRequestError, PermissionDeniedError, NotFoundError, UnprocessableEntityError, RateLimitError, ServerError) dispatched by HTTP status code, and exponential-backoff retry logic with Retry-After support to both sync and async HTTP clients.

Error dispatch: to_api_error now returns the most-specific subclass via a _STATUS_MAP lookup, falling through to ServerError for any 5xx and to the base APIStatusError for unmapped 4xx codes.
Retry logic: _should_retry correctly blocks 5xx retries on non-idempotent methods (POST, PATCH) while allowing 429 retries on all methods; _parse_retry_after handles both integer-seconds and RFC 7231 HTTP-date formats.
Public surface: max_retries (default 2) is surfaced on both Gumloop and AsyncGumloop constructors and forwarded to the underlying transport clients.

Confidence Score: 5/5

Safe to merge — the retry and error-dispatch logic is correct, idempotency guards work as described, and Retry-After HTTP-date parsing is properly implemented.

The core changes are well-structured: the idempotency guard correctly prevents POST/PATCH 5xx retries, _parse_retry_after handles both header formats, and the error class hierarchy is cleanly dispatched. The only findings are a misleading backoff comment and an async test that skips patching sleep — neither affects runtime correctness.

No files require special attention.

Important Files Changed

Filename	Overview
src/gumloop/errors.py	Adds six typed error subclasses and a `_STATUS_MAP` dispatch table. `to_api_error` correctly falls through to `ServerError` for any 5xx not in the map, and to the base `APIStatusError` for unknown 4xx codes.
src/gumloop/_http.py	Introduces retry helpers (`_should_retry`, `_retry_delay`, `_parse_retry_after`) and wraps both `_request` and `post_to_stream_host` in a retry loop. POST/PATCH 5xx are correctly blocked from retrying; `Retry-After` HTTP-date format is now handled. A misleading module-level comment describes the backoff formula incorrectly.
src/gumloop/_client.py	Threads `max_retries` through both `Gumloop` and `AsyncGumloop` constructors and forwards it to `HttpClient` / `AsyncHttpClient`. Straightforward plumbing with no issues.
src/gumloop/init.py	Re-exports the five new error classes from the public API surface. `__all__` is correctly updated.
tests/sdk/test_errors.py	18 new tests covering error dispatch, sync/async retry, idempotency guards, and `Retry-After` parsing. `test_async_retries_on_500_then_succeeds` does not patch `asyncio.sleep`, so it will actually sleep for up to 0.5 s non-deterministically.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[HTTP request via _request / post_to_stream_host] --> B{status < 400?}
    B -- yes --> C[Return response]
    B -- no --> D[to_api_error typed subclass]
    D --> E{_should_retry?}
    E -- no: 4xx non-429, or 5xx on POST/PATCH --> F[raise exc]
    E -- yes: 429 or 5xx on idempotent method --> G{attempt < max_retries?}
    G -- no --> F
    G -- yes --> H[_parse_retry_after]
    H --> I{Retry-After header?}
    I -- seconds string --> J[use parsed float]
    I -- HTTP-date string --> K[compute delta from now]
    I -- absent --> L[exponential backoff with full jitter]
    J & K & L --> M[sleep / asyncio.sleep]
    M --> A

_{Reviews (2): Last reviewed commit: "address review: idempotent-only 5xx retr..." | Re-trigger Greptile}

greptile-apps · 2026-06-10T18:53:03Z

+        for attempt in range(self._max_retries + 1):
+            response = self._client.request(method, path, headers=headers, **kwargs)
+            if response.status_code < 400:
+                return response.json() if response.content else None
+            exc = to_api_error(response)
+            if attempt < self._max_retries and _should_retry(exc):
+                delay = _retry_delay(attempt, _parse_retry_after(response))
+                logger.debug("retrying %s %s (attempt %d, delay %.2fs)", method, path, attempt + 1, delay)
+                time.sleep(delay)
+                continue
+            raise exc


Non-idempotent POST/PATCH retried on 5xx

_request is used by post() and patch() as well as get() and delete(), so the retry loop applies to all of them equally. A 5xx can be returned by a proxy or load balancer after the upstream already committed the write — retrying will then execute the mutation a second time. For example, client.agents.create(...) hitting a 502 from a gateway would create the agent twice before the caller sees any error. The retry should be limited to idempotent methods (GET, DELETE, HEAD) or guarded by a method check.

greptile-apps · 2026-06-10T18:53:04Z

+def _parse_retry_after(response: httpx.Response) -> float | None:
+    raw = response.headers.get("retry-after")
+    if raw is None:
+        return None
+    try:
+        return float(raw)
+    except ValueError:
+        return None


Retry-After HTTP-date format is silently dropped

RFC 7231 allows Retry-After to be either a delay-in-seconds (Retry-After: 30) or an HTTP-date (Retry-After: Wed, 21 Oct 2015 07:28:00 GMT). When the header contains a date string, float(raw) raises ValueError, the function returns None, and the code falls back to exponential backoff — which for attempt=0 is at most 0.5 s. If the server sent a date meaning "wait 30 s", the client will retry far too soon, burn through all retries, and still surface a RateLimitError to the caller while likely worsening the rate-limit situation on the server side.

greptile-apps · 2026-06-10T18:53:05Z

@@ -184,7 +227,7 @@ def stream_typed(
                    yield response_model.model_validate_json(event.data)
                except ValidationError:
                    # Server-side mid-stream error frames or schema-drift events
-                    # land here. 
+                    # land here.
                    logger.debug("dropped non-%s SSE: %s", response_model.__name__, event.data)
                    continue



stream and stream_typed have no retry logic

Both HttpClient.stream / HttpClient.stream_typed (and their async counterparts) still use the original single-attempt pattern. A 429 or 5xx on a streaming request will raise immediately, while the same status on a non-streaming request through _request or post_to_stream_host will be retried up to max_retries times. Callers who expect consistent retry behaviour regardless of whether they use the streaming or non-streaming variant will be surprised by this inconsistency.

greptile-apps · 2026-06-10T18:53:06Z

+@respx.mock
+def test_retries_on_500_then_succeeds(client: Gumloop) -> None:
+    route = respx.get(f"{API_BASE}/agents").mock(
+        side_effect=[
+            httpx.Response(500, json={"error": {"message": "internal error"}}),
+            httpx.Response(200, json={"agents": []}),
+        ]
+    )
+
+    result = client.agents.list()
+
+    assert result.agents == []
+    assert route.call_count == 2


Test performs a real time.sleep during retry

test_retries_on_500_then_succeeds does not patch time.sleep, so the first retry will actually block for random.uniform(0, 0.5) seconds. While jitter can make this 0, it is non-deterministic and will occasionally slow down the test suite. All other retry tests that care about timing explicitly patch time.sleep — this one should too for consistency.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

…leep patch in tests

add structured error types and retry logic for 429/5xx

78b666e

greptile-apps Bot reviewed Jun 10, 2026

View reviewed changes

address review: idempotent-only 5xx retries, HTTP-date Retry-After, s…

b656683

…leep patch in tests

ayaan-kapoor mentioned this pull request Jun 10, 2026

add retry logic to stream() and stream_typed() on initial connection #8

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

add structured error types and retry logic for 429/5xx#7

add structured error types and retry logic for 429/5xx#7
ayaan-kapoor wants to merge 2 commits into
gumloop:mainfrom
ayaan-kapoor:feat/retry-and-error-hierarchy

ayaan-kapoor commented Jun 10, 2026

Uh oh!

greptile-apps Bot commented Jun 10, 2026 •

edited

Loading

Uh oh!

greptile-apps Bot Jun 10, 2026

Uh oh!

greptile-apps Bot Jun 10, 2026

Uh oh!

greptile-apps Bot Jun 10, 2026

Uh oh!

greptile-apps Bot Jun 10, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

ayaan-kapoor commented Jun 10, 2026

What

Why

Tests

Uh oh!

greptile-apps Bot commented Jun 10, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 5/5

Important Files Changed

Flowchart

Uh oh!

greptile-apps Bot Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Jun 10, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

greptile-apps Bot commented Jun 10, 2026 •

edited

Loading